Divergence estimation for multidimensional densities via k-nearest-neighbor distances

Authors

  • Qing Wang
  • Sanjeev R. Kulkarni
  • Sergio Verdú
Abstract

A new universal estimator of divergence is presented for multidimensional continuous densities based on k-nearest-neighbor (k-NN) distances. Assuming independent and identically distributed (i.i.d.) samples, the new estimator is proved to be asymptotically unbiased and mean-square consistent. In experiments with high-dimensional data, the k-NN approach generally exhibits faster convergence than previous algorithms. It is also shown that the speed of convergence of the k-NN method can be further improved by an adaptive choice of k.
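The abstract does not state the estimator itself, so the following is only a minimal sketch of the fixed-k form commonly given for this family of k-NN divergence estimators: (d/n) Σ log(ν_k(i)/ρ_k(i)) + log(m/(n−1)), where ρ_k(i) is the distance from X_i to its k-th nearest neighbor among the other X samples and ν_k(i) is the distance to its k-th nearest neighbor among the Y samples. The function names, the brute-force neighbor search, and the Euclidean metric are assumptions made for illustration; the adaptive choice of k mentioned above is not implemented.

```python
import math
import random


def kth_nn_distance(point, sample, k, skip_self=False):
    """Euclidean distance from `point` to its k-th nearest neighbor in `sample`.

    Brute-force search (an illustrative assumption, not an optimized method).
    """
    dists = sorted(math.dist(point, other) for other in sample)
    # If `point` is itself a member of `sample`, index 0 holds its zero
    # distance to itself, so the k-th true neighbor sits at index k.
    return dists[k] if skip_self else dists[k - 1]


def knn_kl_divergence(xs, ys, k=1):
    """Sketch of a fixed-k k-NN estimate of D(P || Q).

    xs: n points drawn from P, ys: m points drawn from Q; each point is a
    tuple of d floats. Assumes continuous data with no duplicate points,
    so that no neighbor distance is zero.
    """
    n, m, d = len(xs), len(ys), len(xs[0])
    total = 0.0
    for x in xs:
        rho = kth_nn_distance(x, xs, k, skip_self=True)  # within-sample distance
        nu = kth_nn_distance(x, ys, k)                   # cross-sample distance
        total += math.log(nu / rho)
    return d * total / n + math.log(m / (n - 1))


if __name__ == "__main__":
    # Hypothetical demo data: two 2-D Gaussians with unit variance whose
    # means differ by 1 in each coordinate.
    rng = random.Random(0)
    xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    ys = [(rng.gauss(1, 1), rng.gauss(1, 1)) for _ in range(200)]
    print(knn_kl_divergence(xs, ys, k=4))
```

For the demo distributions the true divergence is 1 nat, so the printed value can be compared against that figure; with only a few hundred samples a noticeable deviation is expected. The brute-force search costs O(n(n + m)) distance evaluations, and a spatial index would be the natural replacement for larger samples.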

Similar resources

Bias Reduction and Metric Learning for Nearest-Neighbor Estimation of Kullback-Leibler Divergence

Asymptotically unbiased nearest-neighbor estimators for KL divergence have recently been proposed and demonstrated in a number of applications. With small sample sizes, however, these nonparametric methods typically suffer from high estimation bias due to the non-local statistics of empirical nearest-neighbor information. In this paper, we show that this non-local bias can be mitigated by chang...

Fast Parallel Estimation of High Dimensional Information Theoretical Quantities with Low Dimensional Random Projection Ensembles

Goal: estimation of high dimensional information theoretical quantities (entropy, mutual information, divergence). • Problem: computation/estimation is quite slow. • Consistent estimation is possible by nearest neighbor (NN) methods [1] → pairwise distances of sample points: – expensive in high dimensions [2], – approximate isometric embedding into low dimension is possible (Johnson-Lindenstrau...

On the Estimation of alpha-Divergences

We propose new nonparametric, consistent Rényi-α and Tsallis-α divergence estimators for continuous distributions. Given two independent and identically distributed samples, a “naïve” approach would be to simply estimate the underlying densities and plug the estimated densities into the corresponding formulas. Our proposed estimators, in contrast, avoid density estimation completely, estimating...

Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms

A successful software should be finalized with determined and predetermined cost and time. Software is a production which its approximate cost is expert workforce and professionals. The most important and approximate software cost estimation (SCE) is related to the trained workforce. Creative nature of software projects and its abstract nature make extremely cost and time of projects difficult ...

k-Nearest Neighbor Based Consistent Entropy Estimation for Hyperspherical Distributions

A consistent entropy estimator for hyperspherical data is proposed based on the k-nearest neighbor (knn) approach. The asymptotic unbiasedness and consistency of the estimator are proved. Moreover, cross entropy and Kullback-Leibler (KL) divergence estimators are also discussed. Simulation studies are conducted to assess the performance of the estimators for models including uniform and von Mis...

Journal:
  • IEEE Trans. Information Theory

Volume 55  Issue 

Pages  -

Publication date 2009